HBase and Hypertable for Large Scale Distributed Storage Systems: A Performance Evaluation for Open Source BigTable Implementations
Authors
Abstract
BigTable is a distributed storage system developed at Google for managing structured data, designed to scale to a very large size: petabytes of data across thousands of commodity servers. At present, there are two open-source implementations that closely emulate most of the components of Google’s BigTable, namely HBase and Hypertable. HBase is written in Java and provides BigTable-like capabilities on top of Hadoop. Hypertable is developed in C++ and is compatible with multiple distributed file systems. Both HBase and Hypertable require a distributed file system like the Google File System (GFS), so the comparison also takes into account the architectural differences among the available implementations of GFS-like systems. This paper provides a view of the capabilities of each of these implementations of BigTable, and should help those trying to understand their technical similarities, differences, and capabilities.
Similar resources
Hbase - non SQL Database, Performances Evaluation
HBase is the open source version of BigTable, the distributed storage system developed by Google for managing large volumes of structured data. HBase emulates most of the functionality provided by BigTable. Like most non SQL database systems, HBase is written in Java. The purpose of the current work is to evaluate the performance of the HBase implementation in comparison with SQL database, and...
Big Data in the Cloud: A Survey
Big Data has become a hot topic across several business areas requiring the storage and processing of huge volumes of data. Cloud computing leverages Big Data by providing high storage and processing capabilities and enables corporations to consume resources in a pay-as-you-go model making clouds the optimal environment for storing and processing huge quantities of data. By using virtualized re...
Enhancing Data Processing on Clouds with Hadoop/HBase
In the current information age, large amounts of data are being generated and accumulated rapidly in various industrial and scientific domains. This imposes important demands on data processing capabilities that can extract sensible and valuable information from the large amount of data in a timely manner. Hadoop, the open source implementation of Google’s data processing framework (MapReduce, ...
Scalable Inverted Indexing on NoSQL Table Storage
The growth of data-intensive problems in recent years has brought new requirements and challenges to storage and computing infrastructures. Researchers are not only doing batch loading and processing of large-scale data, but also demanding the capabilities of incremental updates and interactive analysis. Therefore, extending existing storage systems to handle these new requirements beco...
Fast graph mining with HBase
Mining large graphs using distributed platforms has attracted considerable research interest. In particular, large graph mining on Hadoop has been researched extensively, due to its simplicity and massive scalability. However, the design principle of Hadoop to maximize scalability often limits the efficiency of the graph algorithms. For this reason, the performance of graph mining algorithms running ...